On Clustering Using Random Walks
Abstract
We propose a novel approach to clustering, based on deterministic analysis of random walks on the weighted graph associated with the clustering problem. The method is centered on what we call separating operators, which are applied repeatedly to sharpen the distinction between the weights of inter-cluster edges (the so-called separators) and those of intra-cluster edges. These operators can be used on their own for some problems, but become particularly powerful when embedded in a classical multi-scale framework and/or enhanced by other known techniques, such as agglomerative clustering. The resulting algorithms are simple, fast and general, and appear to have many useful applications.
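The abstract does not define the separating operators, so the following is only a minimal sketch of one way such an operator could work. It assumes a connected, undirected graph given as a symmetric weight matrix, and it reweights each edge by how similar the short random-walk visit distributions of its two endpoints are. The function name `sharpen_weights`, the parameters `k` and `iterations`, and the exponential similarity measure are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sharpen_weights(W, k=2, iterations=1):
    """Hypothetical separating step (an assumption, not the paper's definition).

    Each existing edge (u, v) gets a new weight based on the L1 distance
    between the visit distributions of random walks of length 1..k that
    start at u and at v. Assumes no isolated nodes (every row sum > 0).
    """
    W = np.asarray(W, dtype=float).copy()
    for _ in range(iterations):
        # Row-stochastic transition matrix of the random walk.
        P = W / W.sum(axis=1, keepdims=True)
        # visits[u] = sum of the i-step distributions from u, for i = 1..k.
        visits = np.zeros_like(P)
        step = np.eye(len(W))
        for _ in range(k):
            step = step @ P
            visits += step
        new_W = np.zeros_like(W)
        for u, v in zip(*np.nonzero(W)):
            l1 = np.abs(visits[u] - visits[v]).sum()  # lies in [0, 2k]
            new_W[u, v] = np.exp(2 * k - l1) - 1      # assumed similarity measure
        W = new_W
    return W

# Two triangles (nodes 0-2 and 3-5) joined by a single bridge edge 2-3.
W = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[a, b] = W[b, a] = 1.0

S = sharpen_weights(W, k=2, iterations=1)
print("bridge 2-3:", S[2, 3], "intra 0-1:", S[0, 1], "intra 0-2:", S[0, 2])
```

In this toy graph the bridge edge ends up with a lower weight than every edge inside a triangle, which is the kind of sharpening between inter-cluster and intra-cluster edges that the abstract describes. A clustering step (for example, thresholding the weights or agglomerative merging) would still be needed afterwards.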
Related resources
Faster Clustering via Non-Backtracking Random Walks
This paper presents VEC-NBT, a variation on the unsupervised graph clustering technique VEC, which improves upon the performance of the original algorithm significantly for sparse graphs. VEC employs a novel application of the state-of-the-art word2vec model to embed a graph in Euclidean space via random walks on the nodes of the graph. In VEC-NBT, we modify the original algorithm to use a non-b...
Language Model-Based Document Clustering Using Random Walks
We propose a new document vector representation specifically designed for the document clustering task. Instead of the traditional term-based vectors, a document is represented as an n-dimensional vector, where n is the number of documents in the cluster. The value at each dimension of the vector is closely related to the generation probability based on the language model of the corresponding docum...
Random Walks and Evolving Sets: Faster Convergences and Limitations
Analyzing the mixing time of random walks is a well-studied problem with applications in random sampling and more recently in graph partitioning. In this work, we present new analysis of random walks and evolving sets using more combinatorial graph structures, and show some implications in approximating small-set expansion. On the other hand, we provide examples showing the limitations of using...
Generating Scale-free Networks with Adjustable Clustering Coefficient Via Random Walks
This paper presents an algorithm for generating scale-free networks with adjustable clustering coefficient. The algorithm is based on a random walk procedure combined with a triangle generation scheme which takes into account genetic factors; this way, preferential attachment and clustering control are implemented using only local information. Simulations are presented which support the validit...
Random Walks and Evolving Sets: Faster Convergences and Limitations | Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms | Society for Industrial and Applied Mathematics
Analyzing the mixing time of random walks is a well-studied problem with applications in random sampling and more recently in graph partitioning. In this work, we present new analysis of random walks and evolving sets using more combinatorial graph structures, and show some implications in approximating small-set expansion. On the other hand, we provide examples showing the limitations of using ...